Need a reliable way to measure this!
Test common and basic scenarios.
render_basic_scenarios()
Test sensitivity to detour severity.
render_detours()
Test resilience to shifted maps.
render_shifts()
Test resilience to noisy data (e.g. suitability as a match-success classifier).
render_noisy()
Test resilience to subsampled routes.
render_sparse()
Test resilience to start/end point mismatch.
render_start_end_mismatch()
Approaches:
| approach | "detour" sensitivity | "bias" sensitivity | range | speed | examples |
|---|---|---|---|---|---|
| distance | high | low | 0-inf | slow/fast | Hausdorff, Frechet, centroids |
| area | low | high | 0-inf | fast | AUC |
| correlation | medium | medium | 0-100% | slow/fast | AUC normalized, CIR, CPR |
| hybrid | high | high | 0-inf | slow | BOC |
The maximum, over all points of one geometry, of the minimum distance to the other geometry: the farthest any point of either curve lies from its nearest point on the other.
Image('_img/hausdorff.png', embed=True)
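A minimal pure-Python sketch of the discrete Hausdorff distance over vertex sequences (shapely's built-in `hausdorff_distance` is the production equivalent); the two parallel toy routes are hypothetical stand-ins for UTM geometries:

```python
from math import dist

def discrete_hausdorff(a, b):
    # Directed distance: worst-case nearest-neighbour distance from p to q,
    # then symmetrize by taking the max of both directions.
    def directed(p, q):
        return max(min(dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))

# Two hypothetical parallel routes 5 m apart.
route_a = [(0, 0), (10, 0), (20, 0)]
route_b = [(0, 5), (10, 5), (20, 5)]
d = discrete_hausdorff(route_a, route_b)  # 5.0
```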
The Fréchet distance is a measure of similarity between curves that takes into account the location and ordering of the points along the curves.
Image('_img/frechet.png', embed=True, width=500)
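A hedged sketch of the discrete Fréchet distance, the shortest "leash" that lets two walkers traverse both curves monotonically; the toy routes below are illustrative only (the notebook's real implementation is flagged as too slow for bulk runs):

```python
from math import dist
from functools import lru_cache

def discrete_frechet(p, q):
    # Classic dynamic-programming recurrence: the leash at (i, j) is the local
    # distance or the best predecessor state, whichever is larger.
    @lru_cache(maxsize=None)
    def c(i, j):
        d = dist(p[i], q[j])
        if i == 0 and j == 0:
            return d
        if i == 0:
            return max(c(0, j - 1), d)
        if j == 0:
            return max(c(i - 1, 0), d)
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)

    return c(len(p) - 1, len(q) - 1)

route_a = [(0, 0), (5, 0), (10, 0)]
route_b = [(0, 1), (5, 6), (10, 1)]   # detour in the middle forces a 6 m leash
leash = discrete_frechet(route_a, route_b)  # 6.0
```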
Dynamic Time Warping (DTW) sums point-wise distances over the best monotone alignment of the two sequences, tolerating local stretching and compression in either geometry.
Image('_img/dtw.png', embed=True, width=300)
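The alignment above can be sketched with the standard DTW dynamic program; the coordinate sequences are toy data, not the notebook's routes:

```python
from math import dist, inf

def dtw_distance(p, q):
    # Accumulated-cost matrix: each cell adds the local point distance to the
    # cheapest of the three monotone predecessor cells.
    n, m = len(p), len(q)
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(p[i - 1], q[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]

route_a = [(0, 0), (1, 0), (2, 0)]
route_b = [(0, 1), (1, 1), (2, 1)]
score = dtw_distance(route_a, route_b)  # 3.0: three matched pairs, 1 m apart each
```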
Calculates the distance between the geometries' centroids (centers of mass).
l1 = test_detour_big_very_short_1.to_linestring(convert_to_utm=True)
l2 = test_detour_big_very_short_2.to_linestring(convert_to_utm=True)
plt.plot(*zip(*l1.coords))
plt.plot(*zip(*l2.coords))
plt.scatter(l1.centroid.x, l1.centroid.y)
plt.scatter(l2.centroid.x, l2.centroid.y)
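A pure-Python stand-in for shapely's `LineString.centroid` (the length-weighted centroid of a polyline); the metric is then just the distance between the two centroids. The parallel routes below are hypothetical:

```python
from math import dist

def linestring_centroid(coords):
    # Weight each segment's midpoint by its length, then normalize.
    cx = cy = total = 0.0
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        seg = dist((x1, y1), (x2, y2))
        cx += seg * (x1 + x2) / 2
        cy += seg * (y1 + y2) / 2
        total += seg
    return (cx / total, cy / total)

c1 = linestring_centroid([(0, 0), (10, 0)])
c2 = linestring_centroid([(0, 4), (10, 4)])
centroid_dist = dist(c1, c2)  # 4.0
```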
Calculates the area enclosed between the two geometries/curves.
utm_linestring_1 = test_detour_big_very_short_1.to_linestring(convert_to_utm=True)
utm_linestring_2 = test_detour_big_very_short_2.to_linestring(convert_to_utm=True)
sg.Polygon(list(utm_linestring_1.coords) + list(utm_linestring_2.coords)[::-1])
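The same area can be sketched in pure Python with the shoelace formula, equivalent to the `sg.Polygon(...).area` construction above as long as the combined ring does not self-intersect (crossing routes would need splitting first):

```python
def ring_area(ring):
    # Shoelace formula over the closed ring of coordinates.
    s = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

# Toy routes forming a 10 x 2 band between them.
route_1 = [(0, 0), (10, 0)]
route_2 = [(0, 2), (10, 2)]
auc = ring_area(route_1 + route_2[::-1])  # 20.0
```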
Calculates the area enclosed between the two curves and normalizes it by the area of the envelope containing both geometries, yielding a scale-independent score.
utm_linestring_1 = test_detour_big_very_short_1.to_linestring(convert_to_utm=True)
utm_linestring_2 = test_detour_big_very_short_2.to_linestring(convert_to_utm=True)
polygon = sg.Polygon(list(utm_linestring_1.coords) + list(utm_linestring_2.coords)[::-1])
plt.plot(*zip(*list(polygon.envelope.boundary.coords)))
plt.plot(*zip(*list(polygon.boundary.coords)))
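A pure-Python sketch of the normalization step, using stand-ins for shapely's `Polygon.area` and `Polygon.envelope`; the triangular ring is a toy area between two hypothetical routes:

```python
def shoelace_area(ring):
    # Shoelace formula over the closed ring of coordinates.
    s = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def envelope_area(points):
    # Area of the axis-aligned bounding box (shapely's envelope).
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

ring = [(0, 0), (10, 0), (5, 5)]
normalized = shoelace_area(ring) / envelope_area(ring)  # 25 / 50 = 0.5
```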
Geometric mean of the maximum point distance and the total area between the curves (AUC).
l1 = test_detour_big_very_short_1.to_linestring(convert_to_utm=True)
l2 = test_detour_big_very_short_2.to_linestring(convert_to_utm=True)
plt.plot(*zip(*l1.coords))
plt.plot(*zip(*l2.coords))
plt.scatter(l1.centroid.x, l1.centroid.y)
plt.scatter(l2.coords[13][0] + 85, l2.coords[13][1] - 28)  # hand-picked marker position for the figure
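A hedged sketch of a BOC-style hybrid score, in the assumed form stated above (geometric mean of the maximum point distance and the total area between the curves), so that both sharp detours and systematic bias move the score:

```python
from math import sqrt

def boc_score(max_distance, auc):
    # Geometric mean keeps the score in the units' geometric middle ground and
    # zeroes out when either component is zero.
    return sqrt(max_distance * auc)

boc = boc_score(5.0, 20.0)  # 10.0
```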
% of points that lie close enough to the other geometry.
corridor_1 = utm_linestring_1.buffer(10)
corridor_2 = utm_linestring_2.buffer(10)
display('All points:')
display(corridor_1.union(corridor_2))
display('Point matches:')
display(corridor_1.intersection(corridor_2))
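A pure-Python sketch of the point-match ratio: the fraction of one route's vertices lying within a corridor of the other route, mirroring the `buffer(10)` corridors above. Routes and corridor width are toy values:

```python
from math import dist

def point_segment_distance(p, a, b):
    # Distance from point p to the segment a-b (clamped projection).
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return dist(p, a)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return dist(p, (ax + t * dx, ay + t * dy))

def point_match_ratio(points, line, corridor=10.0):
    # Fraction of points within `corridor` of the other route's polyline.
    def to_line(p):
        return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))
    return sum(to_line(p) <= corridor for p in points) / len(points)

route_a = [(0, 0), (100, 0)]
route_b = [(0, 5), (50, 5), (100, 40)]   # last point strays out of the corridor
pmr = point_match_ratio(route_b, route_a)  # 2/3
```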
Minimum number of single-character edits required to transform the encoded polyline of one route into that of the other.
Image('_img/levensthein.png', embed=True, width=400)
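A sketch of the edit-distance metric using the standard iterative Levenshtein algorithm (insert, delete, substitute, each cost 1); in practice the inputs would be the two routes' encoded polyline strings, so the classic word pair below is only illustrative:

```python
def levenshtein(a, b):
    # Row-by-row dynamic program; prev holds edit distances for a[:i-1].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

edits = levenshtein("kitten", "sitting")  # 3
```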
configs = {
'hausdorff': {'name': 'hausdorff', 'params': {}},
# TOO SLOW 'frechet': {'name': 'frechet', 'params': {}},
# TOO SLOW 'dtw': {'name': 'dtw', 'params': {}},
'centroids': {'name': 'centroids', 'params': {}},
'auc': {'name': 'auc', 'params': {}},
'auc_norm': {'name': 'auc', 'params': {'normalized': True}},
'bocs': {'name': 'bocs', 'params': {}},
'pmr': {'name': 'pmr', 'params': {}},
'pmr_lenient': {'name': 'pmr', 'params': {'buffer': 50}},
'pmr_strict': {'name': 'pmr', 'params': {'buffer': 3}},
'levenshtein': {'name': 'levenshtein', 'params': {}},
}
tests = sorted(glob.glob('data/*.json'))
results = run_tests(tests, configs)
df = pd.DataFrame(results).transpose()
Score correlation:
sns.heatmap(df.corr())
100 tests, Berlin coverage map:
results_df = pd.read_csv('results/routes_comparison.csv')
plot_all(results_df)
Results:
create_debug_artifacts(results_df)
Package: route_quality.metrics.geometry
Script: geometry-comparison.py
analyze_results()
| | score_compare_hausdorff | score_compare_cir | score_compare_pmr |
|---|---|---|---|
| count | 100.000000 | 100.000000 | 100.000000 |
| mean | 700.722069 | 0.567035 | 0.711785 |
| std | 896.430325 | 0.277168 | 0.277051 |
| min | 4.388269 | 0.000950 | 0.004310 |
| 25% | 145.836679 | 0.347873 | 0.546644 |
| 50% | 466.416557 | 0.646440 | 0.791475 |
| 75% | 830.610520 | 0.814544 | 0.944706 |
| max | 4937.005377 | 0.932101 | 1.000000 |
compare_distributions(metrics, 'length', 'Length [km]', palette=palette)
compare_distributions(metrics, 'eta', 'ETA [min]', palette=palette)
compare_distributions(metrics, 'speed', 'Speed [km/h]', palette=palette)
df['das-traffic'] = df['speed_das-traffic'] - df['speed_google']
df['das'] = df['speed_das'] - df['speed_google']
vs = pd.melt(df[['route_id', 'das-traffic', 'das']], id_vars=['route_id'], var_name='provider', value_vars=['das-traffic', 'das'], value_name='speed_diff')
compare_distributions(vs, 'speed_diff', 'Speed diff. to Google [km/h]', palette=palette)
sns.lmplot(data=metrics.drop(index=24).sort_values('speed', ascending=True),
x='length', y='eta', hue='provider', palette=palette, markers='.')
plt.xlabel('Length [km]')
plt.ylabel('ETA [min]')